AITopics | acc 1

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Clip_Dataset__NeurIPS2022_ (10)

Thao Nguyen

Neural Information Processing Systems, Feb-10-2026, 11:59:48 GMT

artificial intelligence, machine learning, withalldistributionshift, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

SpecTUS: Spectral Translator for Unknown Structures annotation from EI-MS spectra

Hájek, Adam, Hecht, Helge, Price, Elliott J., Křenek, Aleš

arXiv.org Artificial Intelligence, Feb-7-2025

Compound identification and structure annotation from mass spectra is a well-established task widely applied in drug detection, criminal forensics, small molecule biomarker discovery and chemical engineering. We propose SpecTUS: Spectral Translator for Unknown Structures, a deep neural model that addresses the task of structural annotation of small molecules from low-resolution gas chromatography electron ionization mass spectra (GC-EI-MS). Our model analyzes the spectra in a de novo manner: a direct translation from the spectra into a 2D structural representation. Our approach is particularly useful for analyzing compounds unavailable in spectral libraries. In a rigorous evaluation of our model on the novel structure annotation task across different libraries, we outperformed standard database search techniques by a wide margin. On a held-out testing set, including 28,267 spectra from the NIST database, we show that our model's single suggestion perfectly reconstructs 43% of the subset's compounds. This single suggestion is strictly better than the candidate of the database hybrid search (a common method among practitioners) in 76% of cases. In a still affordable scenario of 10 suggestions, perfect reconstruction is achieved in 65% of cases, and 84% are better than the hybrid search.
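The single-suggestion and 10-suggestion figures in this abstract read like top-k exact-match rates. The following is a minimal sketch of how such a rate could be computed, not the authors' evaluation code. It assumes each structure is already a canonical string, so plain string equality stands in for "perfect reconstruction"; the function name `top_k_exact_match` and the toy data are hypothetical.

```python
from typing import Sequence


def top_k_exact_match(
    candidates: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int,
) -> float:
    """Fraction of spectra whose reference appears among the first k candidates.

    Assumption: structures are canonical strings, so equality means an
    exact structural match. Real pipelines would canonicalize first.
    """
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    if not references:
        raise ValueError("at least one spectrum is required")
    hits = sum(ref in ranked[:k] for ranked, ref in zip(candidates, references))
    return hits / len(references)


# Toy data (hypothetical): ranked candidate lists for four spectra.
candidates = [
    ["CCO", "COC"],
    ["CC(=O)O", "OCC=O"],
    ["c1ccccc1", "C1CCCCC1"],
    ["CCN", "CNC"],
]
references = ["CCO", "OCC=O", "C1CCCCC1", "CCCN"]

top1 = top_k_exact_match(candidates, references, k=1)
top2 = top_k_exact_match(candidates, references, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")  # top-1: 0.25, top-2: 0.75
```

Allowing more suggestions can only keep or raise the rate, which matches the direction of the reported numbers (43% for one suggestion, 65% for ten).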

artificial intelligence, machine learning, natural language, (20 more...)

2502.05114

Country:

Europe > Czechia > South Moravian Region > Brno (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)

Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs

Yang, Zhe, Zhang, Yichang, Wang, Yudong, Xu, Ziyao, Lin, Junyang, Sui, Zhifang

arXiv.org Artificial Intelligence, Dec-27-2024

Large Language Models (LLMs) can correct their self-generated responses, but accuracy sometimes declines after self-correction. To better understand self-correction, we decompose, evaluate, and analyze the self-correction behaviors of LLMs. By enumerating and analyzing answer correctness before and after self-correction, we decompose the self-correction capability into confidence (staying confident in answers that are already correct) and critique (turning wrong answers into correct ones) capabilities, and propose two metrics from a probabilistic perspective to measure these two capabilities, along with another metric for overall self-correction capability evaluation. Based on our decomposition and evaluation metrics, we conduct extensive experiments and draw some empirical conclusions. For example, we find that different models can exhibit distinct behaviors: some models are confident while others are more critical. We also find a trade-off between the two capabilities (i.e., improving one can lead to a decline in the other) when manipulating model self-correction behavior by prompts or in-context learning. Further, we find a simple yet efficient strategy to improve self-correction capability by transforming the Supervised Fine-Tuning (SFT) data format; our strategy outperforms vanilla SFT in both capabilities and achieves much higher accuracy after self-correction. Our code will be publicly available on GitHub.
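One way to read the decomposition described above is as a pair of conditional rates over before/after correctness. The sketch below is an illustrative interpretation, not the paper's exact metric definitions: it assumes confidence is the share of initially correct answers that remain correct, and critique is the share of initially wrong answers that become correct. The function name and the toy data are hypothetical.

```python
from typing import Optional, Sequence


def self_correction_rates(
    before: Sequence[bool], after: Sequence[bool]
) -> dict[str, Optional[float]]:
    """Summarize correctness before and after a self-correction round.

    Assumed definitions (an interpretation of the abstract):
      confidence = P(correct after | correct before)
      critique   = P(correct after | wrong before)
    A rate is None when its conditioning set is empty.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if not before:
        raise ValueError("at least one answer is required")
    kept = sum(b and a for b, a in zip(before, after))        # correct -> correct
    fixed = sum((not b) and a for b, a in zip(before, after))  # wrong -> correct
    n_correct = sum(before)
    n_wrong = len(before) - n_correct
    return {
        "acc_before": n_correct / len(before),
        "acc_after": sum(after) / len(after),
        "confidence": kept / n_correct if n_correct else None,
        "critique": fixed / n_wrong if n_wrong else None,
    }


# Toy data (hypothetical): 8 questions, 4 correct at first.
before = [True, True, True, True, False, False, False, False]
after = [True, True, True, False, True, False, False, False]

rates = self_correction_rates(before, after)
print(rates)
```

In this toy case accuracy is 0.5 both before and after, yet the two rates differ (0.75 and 0.25), which illustrates why a single accuracy number can hide what self-correction is doing.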

large language model, machine learning, natural language, (19 more...)

2412.19513

Country: North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction

Li, Xiaoyuan, Wang, Wenjie, Li, Moxin, Guo, Junrong, Zhang, Yang, Feng, Fuli

arXiv.org Artificial Intelligence, Jun-2-2024

The rapid advancement of Large Language Models (LLMs) in the realm of mathematical reasoning necessitates comprehensive evaluations to gauge progress and inspire future directions. Existing assessments predominantly focus on problem-solving from the examinee perspective, overlooking the complementary examiner perspective of error identification and correction. From the examiner perspective, we define four evaluation tasks for error identification and correction, along with a new dataset with annotated error types and steps. We also design diverse prompts to thoroughly evaluate eleven representative LLMs. Our principal findings indicate that GPT-4 outperforms all other models, while the open-source model LLaMA-2-7B demonstrates abilities comparable to those of the closed-source models GPT-3.5 and Gemini Pro. Notably, calculation error proves the most challenging error type. Moreover, prompting LLMs with the error types can improve the average correction accuracy by 47.9%. These results reveal potential directions for developing the mathematical reasoning abilities of LLMs. Our code and dataset are available at https://github.com/LittleCirc1e/EIC.

acc 1, error type, explanation, (17 more...)

2406.00755

Country:

Europe > United Kingdom > England > Berkshire > Windsor (0.04)
Asia > Singapore (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Worst Case Matters for Few-Shot Recognition

Fu, Minghao, Cao, Yun-Hao, Wu, Jianxin

arXiv.org Artificial Intelligence, Jul-24-2022

Few-shot recognition learns a recognition model with very few (e.g., 1 or 5) images per category, and current few-shot learning methods focus on improving the average accuracy over many episodes. We argue that in real-world applications we may often only try one episode instead of many, and hence maximizing the worst-case accuracy is more important than maximizing the average accuracy. We empirically show that a high average accuracy does not necessarily mean a high worst-case accuracy. Since this objective is not accessible, we propose to reduce the standard deviation and increase the average accuracy simultaneously. In turn, we devise two strategies from the bias-variance tradeoff perspective to implicitly reach this goal: a simple yet effective stability regularization (SR) loss together with a model ensemble to reduce variance during fine-tuning, and an adaptability calibration mechanism to reduce the bias. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed strategies, which outperform current state-of-the-art methods by a significant margin in terms of not only average, but also worst-case accuracy. Our code is available at https://github.com/heekhero/ACSR.
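The gap between average and worst-case accuracy can be made concrete with a small calculation. This is a minimal sketch using made-up per-episode accuracies, not results from the paper; `episode_summary`, `method_a`, and `method_b` are hypothetical names, and the population standard deviation is an arbitrary choice for illustration.

```python
import statistics
from typing import Sequence


def episode_summary(accuracies: Sequence[float]) -> dict[str, float]:
    """Mean, standard deviation, and worst case over few-shot episodes."""
    if not accuracies:
        raise ValueError("at least one episode is required")
    return {
        "mean": statistics.fmean(accuracies),
        "std": statistics.pstdev(accuracies),  # population std, for illustration
        "worst": min(accuracies),
    }


# Made-up per-episode accuracies for two hypothetical methods.
method_a = [0.80, 0.82, 0.30, 0.84, 0.84]  # strong on average, one bad episode
method_b = [0.68, 0.70, 0.66, 0.72, 0.69]  # slightly lower average, stable

summary_a = episode_summary(method_a)
summary_b = episode_summary(method_b)
for name, s in (("A", summary_a), ("B", summary_b)):
    print(f"{name}: mean={s['mean']:.3f} std={s['std']:.3f} worst={s['worst']:.2f}")
```

Here method A has the higher mean but the lower worst case and the larger spread, so ranking by the average alone would pick the method that fails hardest on a single episode. This is consistent with the abstract's suggestion to raise the mean and reduce the standard deviation together as a proxy for the worst case.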

acc 1, accuracy, computer vision, (11 more...)

2203.06574

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > California (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
